Understanding Complex Visually Referring Utterances

Authors

Peter Gorniak

Deb Roy

Abstract

We propose a computational model of visually-grounded spatial language understanding, based on a study of how people verbally describe objects in visual scenes. We describe our implementation of word-level visually-grounded semantics and their embedding in a compositional parsing framework. The implemented system selects the correct referents in response to a broad range of referring expressions for a large percentage of test cases. In an analysis of the system’s successes and failures, we reveal how visual context influences the semantics of utterances and propose future extensions to the model that take such context into account.
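To make the idea of composing word-level grounded meanings concrete, here is a minimal sketch. It is not the authors' implementation: the scene representation, the `LEXICON` entries, and the right-to-left composition rule are all assumptions chosen for illustration, and a real system would use a parser rather than simple word order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical scene object; the fields are illustrative assumptions."""
    name: str
    x: float  # horizontal position, larger means further right
    color: str


Candidates = List[SceneObject]
# Each word meaning maps a candidate set to a (usually smaller) candidate set.
WordMeaning = Callable[[Candidates], Candidates]


def color_filter(color: str) -> WordMeaning:
    """Keep only the candidates of the given color."""
    return lambda objs: [o for o in objs if o.color == color]


def extreme(key: Callable[[SceneObject], float], pick) -> WordMeaning:
    """Keep the single candidate that minimizes or maximizes `key`."""
    def meaning(objs: Candidates) -> Candidates:
        return [pick(objs, key=key)] if objs else []
    return meaning


# Toy lexicon (an assumption, not taken from the paper).
LEXICON: Dict[str, WordMeaning] = {
    "red": color_filter("red"),
    "blue": color_filter("blue"),
    "leftmost": extreme(lambda o: o.x, min),
    "rightmost": extreme(lambda o: o.x, max),
    "the": lambda objs: objs,  # no visual grounding in this sketch
    "one": lambda objs: objs,  # generic head noun, keeps all candidates
}


def resolve(utterance: str, scene: Candidates) -> Candidates:
    """Apply word meanings right to left, so words nearer the head noun
    narrow the candidates before words further from it.
    Raises KeyError for words missing from the toy lexicon."""
    candidates = list(scene)
    for word in reversed(utterance.lower().split()):
        candidates = LEXICON[word](candidates)
    return candidates
```

In this sketch, "the leftmost red one" first restricts the scene to red objects and only then picks the leftmost of those, which is one simple way the order of composition can change the referent.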

Similar resources

Grounded Semantic Composition for Visual Scenes

We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions. The model has been implemented, and it is able to understand a broad range of spatial referring expressions. We describe our implementatio...

Grounded Semantic Composition for Visual Scenes DRAFT - Do not cite

We present a study on how people verbally describe objects in visual scenes. The emphasis of our analysis lies on the combination of individual word meanings to produce meanings for complex referring expressions. Based on this study, we propose a computational model of visually-grounded spatial language understanding. We have implemented the model, and it is able to understand a broad range of ...

Grounded Semantic Composition for Visual Scenes

Coordinating Understanding and Generation in an Abductive Approach to Interpretation

We use a dynamic, context-sensitive approach to abductive interpretation to describe coordinated processes of understanding, generation and accommodation in dialogue. The agent updates the dialogue uniformly for its own and its interlocutors’ utterances, by accommodating a new context, inferred abductively, in which utterance content is both true and prominent. The generator plans natural and c...

Evaluation of the Scusi? Spoken Language Interpretation System - A Case Study

We present a performance evaluation framework for Spoken Language Understanding (SLU) modules, focusing on three elements: (1) characterization of spoken utterances, (2) experimental design, and (3) quantitative evaluation metrics. We then describe the application of our framework to Scusi?, our SLU system that focuses on referring expressions.

Publication date: 2003
